1. INTRODUCTION

Phonology is the aspect of language that deals with the rules for structuring and sequencing
speech sounds. In many languages, such as American English, Chinese and Arabic, the acoustic
characteristics of the vowels have been well studied. This is not so for an under-resourced language
such as the Malay language [1]. Traditionally, the pronunciation of vowels in any language, including
Malay, was taught by referring to the International Phonetic Association (IPA) transcription sound chart.
The pronunciation of speech sounds was documented by describing and distinguishing the vowel sounds
through impressionistic methods [2-8]. Impressionistic methods are traditional research methods that
identify and reproduce the sounds of a language without using any instrument. In this case, phoneticians
rely on hearing, vision, and sensitivity to their own speech organs to review each articulation [9]. Thus,
researchers may express varying opinions about the nature of the vowel sounds of the Malay language,
because hearing varies among researchers.

The Malay language, as described in the Encyclopedia Britannica [10], is a branch of
the Austronesian language family, spoken by more than 33 million people in Malaysia, Indonesia, Southern
Thailand, Singapore and Borneo [11]. The voice systems of the Malay language varieties were shown to differ
greatly, encompassing virtually all variations of the Austronesian languages [12-14]. A study comparing five
Malay/Indonesian dialects was done to understand the voice systems of the Austronesian language family.
The dialects are prescriptive Standard Indonesian (SI), three Malay varieties spoken in the Malay heartland of
Sumatra, Basd Selangon (BS), Sarang Lan Malay (SL) and Mudung Darat Malay (MD), and the Malay of the city
of Kuching in Sarawak (KM) [12]. Of all the Malay dialects, the most important is the dialect of the southern
Malay Peninsula, which has formed the basis of the standard Malay language and the official language of
Indonesia [10]. Thus, the need to further document the phonological aspects of the spoken Malay language
and its dialects is imperative.

Journal homepage: http://ijere.iaescore.com
Int J Eval & Res Educ. ISSN: 2252-8822

In the past decade, a number of studies on the Malay language vowels were conducted. Several
investigations [15-18] examined the phonetic properties of the six Malay vowels /a/, /e/, /ə/, /i/, /o/ and
/u/. The studies in [19, 20] employed image processing techniques on magnetic resonance imaging (MRI) to
visualize the vocal tract and obtain dynamic articulatory parameters during production of the Malay vowels.
In [15], four formant frequencies were used to analyze six sustained vowels produced by Malay children aged
between 7 and 12 years old. The aim was to investigate the acoustical differences in speech production across
age groups and gender. The results showed that the formant frequencies of the females were generally higher
than those of the males. However, there was no significant difference in the formant frequencies of most
vowels across the age groups. The authors [16] then analyzed the fundamental frequency (F0) and perturbation
measures of the vowels across age groups and gender. There was no significant F0 difference between the
vowels for either males or females. However, a significant difference was found across the age groups.
The perturbation measures, on the other hand, showed no significant difference across the age groups or
gender. The fundamental frequency of the vowels was also extracted for Malaysian Chinese young adults
in [17]. As expected, the Malaysian Chinese females had significantly higher F0 than the Malaysian Chinese
males in all six vowels. There were also no significant differences in F0 across the vowels for each gender.
In [18], Ting et al. extended the same research to Malaysian Malay young adults by extracting the F0 and
perturbation measures from six sustained Malay vowels. They also concluded that there was no significant
difference in F0 and perturbation measures across the vowels. The Malay females also had higher F0 than the
Malay males, but no significant difference was found for perturbation measures across genders. However,
comparisons across ethnic groups showed that F0 varies between Malaysian Malays and other ethnic groups.

Most phonological studies on Malay vowels compared the acoustical differences of speech
production between genders and across age groups. While these works are important and valuable for
understanding the spoken Malay language, documenting the pronunciation of standard Malay and its
dialects using vowel charts will form the fundamental basis for further research such as speech synthesis,
speech education, speech rehabilitation and speech reproduction. This is the motivation of our work.
Furthermore, previous studies only explored the sound production of the six standard Malay vowels
/a/, /e/, /ə/, /i/, /o/ and /u/. However, [21] compiled a list of nine Malay language vowels consisting of
six standard vowel phonemes /a/, /e/, /i/, /o/, /u/, /ə/ and three allophones /ɔ/, /ɛ/, /3/ that were
frequently used in the dialects, as put forward by [5]. On the other hand, [21] and [22] later argued that
the allophones uttered by speakers of some dialects were reduced to only two, /ɔ/ and /ɛ/. Therefore,
based on the arguments of [21] and [22], we propose to characterize the dynamic vowel pronunciations of
eight Malay vowels /a/, /e/, /i/, /o/, /u/, /ə/, /ɔ/, /ɛ/ using formant frequencies. The contributions of
this paper are threefold: 1) the vowel features of the two Malay allophones /ɔ/ and /ɛ/ were documented
using instruments and illustrated in the IPA vowel chart; 2) the first and second formant frequencies of
eight Malay vowels were quantified and documented; and 3) audio collections of all eight Malay vowels were
acquired from spoken standard Malay and dialects of four states in Malaysia.


2. RESEARCH METHOD 
2.1. Subjects 

The subjects were all native Malay speakers who use the Malay language on a daily basis. At the time
of recording, all subjects were healthy, with intact articulatory organs and no history of speech disorder.
They were locals in their respective states and had lived in the same district all their lives. For this
paper, the subjects consisted of four males and three females aged from 40 to 60 years old.


2.2. Equipment and procedures

Unlike previous work that recorded the spoken vowels in a controlled, enclosed environment,
our subjects were recorded in open, outdoor environments. We selected our subjects and started
a conversation with each subject at a place where they were most comfortable, such as their home,
a coffee shop or the market. The spontaneous conversations were recorded using a Sony ICD-TX650 16GB
slim digital voice recorder, which has a frequency response of 95 to 20,000 Hz and a battery life of up
to 12 hours of recording. The recorder offers excellent recording quality and is slim, compact and
lightweight enough to be attached comfortably to the human body. The spoken words containing the vowels
were selected from the conversations by a Malay language linguist and are summarized in Table 1.


Table 1. The Malay vowels dataset 


Vowel   No. of words   Duration of words      No. of subjects
a       153            1 minute 30 seconds    3
e       71             40 seconds             3
i       124            1 minute 8 seconds     3
o       99             56 seconds             3
u       124            1 minute 6 seconds     3
ə       138            1 minute 28 seconds    3
ɔ       29             18 seconds             2
ɛ       12             7 seconds              2


In this research, seven subjects were recorded to obtain the eight Malay vowels /a/, /e/, /i/, /o/, /u/,
/ə/, /ɔ/, /ɛ/. However, the vowels /ɔ/ and /ɛ/ were acquired from only the six subjects who were able to
speak in dialects. One of the subjects could only speak standard Malay, thus only the six vowels /a/, /e/,
/i/, /o/, /u/ and /ə/ were acquired from this subject. Table 1 shows that the vowels /a/, /i/, /u/ and /ə/
are the most frequently used in the spoken Malay language, followed by /o/, /e/, /ɔ/ and /ɛ/.

The collected spoken words were pre-processed to remove the artefacts produced during
recording. The spontaneous speech also contained filled pauses and elongations, which had to be identified
and removed to preserve the quality of the spoken words [23-25]. The collected words were sampled at
16 kHz in a mono channel with 16-bit resolution. Framing was done by blocking the speech signal into
frames of N samples. In our work, we used a frame length of 20 ms [26] with adjacent frames shifted by
10 ms [27]. Windowing was applied after framing to minimize the signal discontinuities at the beginning
and end of each frame. Each frame was multiplied by a Hanning window, which decreases smoothly to zero at
the start and end of the frame. The Hanning window was chosen because it produced a smoother and more
accurate signal [28] than the Hamming or Blackman functions. The final step of pre-processing was
filtering to suppress interfering signals and reduce environmental noise. High-pass filtering was done
using Audacity with different cut-off values depending on the background noise. The vowels were annotated
manually from the spoken words by a technical person using the speech analysis tool Praat [29], with
end point evaluation [30-31] as guidance. Figure 1 illustrates the transcription of the spoken words and
vowels.
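
The framing and windowing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the stated 16 kHz sampling rate, 20 ms frames and 10 ms shift, uses a symmetric Hann (Hanning) window, and drops trailing samples that do not fill a whole frame. The function name `frame_and_window` and the synthetic test tone are hypothetical.

```python
import math


def frame_and_window(signal, sample_rate=16000, frame_ms=20, shift_ms=10):
    """Split a signal into overlapping frames and apply a Hann window to each."""
    frame_len = sample_rate * frame_ms // 1000  # 320 samples at 16 kHz
    shift = sample_rate * shift_ms // 1000      # 160 samples at 16 kHz
    # Symmetric Hann window: tapers smoothly to zero at both frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    # Only full frames are kept (assumption: no zero-padding of the tail).
    for start in range(0, len(signal) - frame_len + 1, shift):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# Synthetic stand-in for a recording: one second of a 200 Hz sine at 16 kHz.
tone = [math.sin(2 * math.pi * 200 * n / 16000) for n in range(16000)]
frames = frame_and_window(tone)
print(len(frames), len(frames[0]))
```

With a 10 ms shift and 20 ms frames, adjacent frames overlap by half, so every sample away from the edges contributes to two frames.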




Figure 1. Transcription of the vowels /a/ from three Malay words, /ape/, /apo/, and /apobile/ done using Praat 
by a technical person. The selection of words and groupings of the vowels were done by a linguist 


In Figure 1, the first row shows the speech waveform. The second row shows
the spectrogram of the speech waveform with a maximum frequency of 5000 Hz. The blue line is
the pitch of the speech waveform. For annotation and labelling, the third and fourth rows represent the first
and second labelling tiers (marked as 1 and 2). The first and second tiers show the manual labelling
at word and vowel level, respectively. In this example, the vowel /a/ at the vowel level (second tier) was
extracted and transcribed from three spoken words /ape/, /apo/, and /apobile/ at the word level (first tier).
The beginning and end of each vowel were determined based on three criteria: 1) a dramatic change in
amplitude in the vowel’s waveform, 2) a change in the energy of the vowel’s formants accompanied by
a change in complexity in the waveform indicating a loss of energy in F2 and F3, and 3) the onset of
aperiodicity. A total of 750 vowels were extracted from the spoken words and used to document
the vowel pronunciations. As can be seen from Table 1, the highest number of tokens is for the vowel /a/,
followed by /ə/, /i/ and /u/. These are the most commonly used vowels in the spoken Malay language.


2.3. Vowel feature extraction 

Formant extraction was done using Praat [32] for all the vowels. The standard formant settings of
Praat were used: a maximum formant frequency of 5500 Hz, five formants, a window length of 25
milliseconds, and a dynamic range of 30 dB. An example of a spectrogram for the vowel /a/
is shown in Figure 2.





Figure 2. Gray-scale spectrogram and the formant frequencies of a vowel /a/, shown as an example.
F1, F2, F3 and F4 are visible as the dark bands of the spectrogram


The formant frequencies (F1-F4) of each vowel were calculated using the formant values at the middle
point of each sample. In Figure 2, the formants can be seen in the wideband spectrogram as dark bands.
Meanwhile, Figure 3 illustrates the four formant frequencies of all 153 tokens of the vowel /a/. Overall,
the distribution of each formant of the vowel /a/ was shown to be discriminative. However, a few formant
frequencies overlapped, particularly F1 with F2 and F2 with F3, due to the close resemblance of
the vowel productions. F1 and F2 are important in determining the quality of vowels and are
frequently said to correspond to the open or close position of the mouth and the front or back dimension
of the mouth [33]. In contrast, the higher formants F3 and F4 represent high-pitched sounds,
especially in singing. F3 and F4 are manipulated by lowering the larynx and elevating the tongue blade to
enhance this part of the spectrum and make it heard above an orchestral accompaniment [34].
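
Taking the value at the middle point of a vowel can be illustrated with a short sketch. It assumes a formant track is available as parallel lists of time stamps and frequencies (for example, exported from Praat) and picks the sample nearest the temporal midpoint. The function name `midpoint_value` and the numbers in the example track are hypothetical, not measurements from this study.

```python
def midpoint_value(times, values):
    """Return the track value whose time stamp is nearest the temporal midpoint."""
    if not times or len(times) != len(values):
        raise ValueError("times and values must be non-empty and equally long")
    mid_time = (times[0] + times[-1]) / 2.0
    nearest = min(range(len(times)), key=lambda i: abs(times[i] - mid_time))
    return values[nearest]


# Illustrative F1 track of one vowel token (seconds, Hz); values are made up.
f1_times = [0.00, 0.01, 0.02, 0.03, 0.04]
f1_track = [780.0, 805.0, 812.0, 809.0, 790.0]
f1_mid = midpoint_value(f1_times, f1_track)
print(f1_mid)
```

Sampling at the midpoint keeps the measurement away from the vowel edges, where the neighbouring consonants influence the formants most.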


Figure 3. Formant frequencies (F1-F4) for all 153 vowels of /a/ collected from 3 subjects (axes: formant
frequency in Hz against the number of vowels /a/). Despite the visibly distinct colour-coded clusters of
the formants, a few overlapping formants of the vowel /a/ can be seen.


Formant characteristics of Malay vowels (Izzad Ramli) 




3. RESULTS AND DISCUSSION 

After the formant frequencies were extracted for all eight vowels, the average formants (Hz) of
each vowel were computed and tabulated in Table 2. As shown in the table, the average formants differ
across all vowels, indicating the distinct pronunciation of each vowel.


Table 2. Average formant frequencies (F1-F4) in Hz for all eight vowels

Vowel   F1    F2     F3      F4
/a/     811   1520   2496    3429
/e/     499   1908   2559    3337
/i/     432   2049   2763    3502
/o/     547   1186   23713   3359
/u/     477   1204   2462    3455
/ə/     556   1550   2459    3384
/ɔ/     665   1247   2657    3603
/ɛ/     690   1979   2647    3535
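
Per-vowel averages of the kind reported in Table 2 can be computed as sketched below. The example assumes each vowel token is stored as a vowel label with its four midpoint formant values. The function name `mean_formants` and the sample measurements are hypothetical and are not data from this study.

```python
from collections import defaultdict


def mean_formants(tokens):
    """Average (F1, F2, F3, F4) over all tokens that share a vowel label."""
    grouped = defaultdict(list)
    for vowel, formants in tokens:
        grouped[vowel].append(formants)
    return {
        vowel: tuple(sum(column) / len(column) for column in zip(*rows))
        for vowel, rows in grouped.items()
    }


# Made-up token measurements in Hz, for illustration only.
tokens = [
    ("a", (800.0, 1500.0, 2500.0, 3400.0)),
    ("a", (820.0, 1540.0, 2490.0, 3460.0)),
    ("i", (430.0, 2050.0, 2760.0, 3500.0)),
]
averages = mean_formants(tokens)
print(averages["a"])
```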


Previous work [27] showed that the vowels differ in the frequencies of the first two formants.
F1 and F2 are also commonly acknowledged as the main carriers of the information necessary
for vowel identification [25]. The discriminative characteristics of F1 and F2 were also used by [35] to
classify the gender of Malay children aged between 7 and 12 years old. Therefore, further analysis of F1
and F2 was deemed necessary in our work.

The frequencies of F1 and F2 for all eight vowels (/a/, /e/, /i/, /u/, /ə/, /o/, /ɔ/, /ɛ/) are illustrated
in Figure 4. A high degree of crowding among adjacent vowels appears for the vowels (/a/, /e/, /u/, /ə/, /o/,
/ɔ/, /ɛ/). Meanwhile, the vowel /i/ is clustered separately at the top left of the crowded vowel area. Even
though the vowels (/a/, /e/, /u/, /ə/, /o/, /ɔ/, /ɛ/) were clustered close to each other, the groups of
vowels were still identifiable. As an example, the clusters of the vowels /a/ and /u/ were highly populated
at the bottom and top of the crowded area, respectively. The exact position of the Malay vowels can be
determined based on the average formant frequencies F1 and F2 and is presented in the acoustic vowel
diagram in Figure 5.

The values of the first and second formant frequencies represent the physical movement of the mouth
during a vowel’s pronunciation. The first formant indicates the height of the tongue body. If a vowel
has a high first formant frequency, a low tongue body was used to produce it. This type of
vowel is called a “low vowel”. Based on Figure 5, the vowel /a/ has the highest value of the first formant,
therefore it was categorized as a low vowel, pronounced with a low tongue body. For a “high vowel”,
a high tongue body is used, and this is shown by a low first formant frequency. For
example, the vowels /i/ and /u/ have low first formant frequencies. Therefore, the vowels /i/ and /u/ were
categorized as high vowels. The other vowels (/e/, /ə/, /o/, /ɔ/, /ɛ/) were categorized as middle vowels,
between the high and low vowels.


Figure 4. Tabulation of the 8 vowels for 4 males and 3 females (axes: first formant (Hz) against second
formant (Hz)). The data is thinned of redundant data points for better clarity of the display


Figure 5. Vowel diagram of the Malay vowels spoken by 4 males and 4 females calculated using the average
formant frequencies of F1 and F2.


The second formant value mostly indicates the frontness or backness of the tongue body. If
a vowel has a high second formant frequency, it is pronounced at the front of the tongue body and
categorized as a “front vowel”. Meanwhile, if it has a low second formant frequency, the sound is created
at the back of the tongue body and is categorized as a “back vowel”. Based on Figure 5, the vowels (/i/, /e/,
/ɛ/) have high second formant frequencies, therefore these vowels were categorized as front vowels, which
are produced at the front of the tongue body. The vowels (/u/, /o/, /ɔ/) with low second formant
frequencies were categorized as back vowels. Meanwhile, the vowels (/a/, /ə/) located in the middle were
categorized as central vowels.
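
The height and backness groupings described above can be checked against the averages in Table 2 by ordering the vowels by F1 and by F2. The sketch below copies the average F1 and F2 values from Table 2 and writes the schwa and the two allophones as ə, ɔ and ɛ. The variable names are hypothetical, and no numeric thresholds are assumed: the code only sorts.

```python
# Average (F1, F2) in Hz, copied from Table 2.
TABLE2 = {
    "a": (811, 1520), "e": (499, 1908), "i": (432, 2049), "o": (547, 1186),
    "u": (477, 1204), "ə": (556, 1550), "ɔ": (665, 1247), "ɛ": (690, 1979),
}

# Low F1 corresponds to a high (close) vowel, high F1 to a low (open) vowel.
by_f1 = sorted(TABLE2, key=lambda v: TABLE2[v][0])
# Low F2 corresponds to a back vowel, high F2 to a front vowel.
by_f2 = sorted(TABLE2, key=lambda v: TABLE2[v][1])

print("F1 ascending (high to low vowels):", by_f1)
print("F2 ascending (back to front vowels):", by_f2)
```

In the F1 ordering, /i/ and /u/ come first and /a/ comes last. In the F2 ordering, /o/, /u/ and /ɔ/ form the back group, /a/ and /ə/ sit in the middle, and /e/, /ɛ/ and /i/ form the front group, which matches the categories given in the text.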

We further analysed the vowel diagram based on gender, as shown in Figure 6. In general,
the females have a wider range of first and second formants. The lowest average first formant of
the females was for the vowel /i/ at 423 Hz and the highest was for the vowel /a/ at 841 Hz. For the
males, the lowest average first formant was for the vowel /i/ at 445 Hz and the highest was for the vowel
/a/ at 685 Hz. The difference between the lowest and highest average first formant was 418 Hz for the
females and 240 Hz for the males. This indicates that the degree of openness and closeness of the mouth
during vowel pronunciation was larger for the females than for the males. The same phenomenon was seen for
the second formant frequencies. The females have a larger range of average second formant frequencies,
from 1180 Hz to 2200 Hz, while the males ranged from 700 Hz to 1020 Hz. The females therefore used the
front and back of the tongue more than the males when pronouncing the vowels. It is also interesting to
note that the vowel /i/ for the males was pronounced further back than the vowel /e/, unlike for the
females, where the vowel /i/ was further front than the vowel /e/. The same was true for the vowels /u/ and
/o/, as can be seen in Figure 6. The vowel diagram of Figure 6 was mapped to the International Phonetic
Alphabet (IPA) chart [36] to position the eight Malay vowels. The left side of the vowel diagram represents
the portion of the mouth closer to the lips, and the right side represents the back of the mouth [37].
Vowel backness refers to the position of the tongue during the articulation of a vowel relative to the back
of the mouth. The top of the chart is the roof of the mouth and the bottom of the chart is the jaw. The
completed vowel chart with the Malay vowels is shown in Figure 7.
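
The first formant ranges quoted for each gender are the difference between the highest and lowest average F1. As a small worked check, the sketch below uses only the four values stated in the text. The dictionary names are hypothetical.

```python
# Lowest (/i/) and highest (/a/) average F1 in Hz, as reported in the text.
f1_extremes = {"female": (423, 841), "male": (445, 685)}

# Range of F1 per gender: highest average minus lowest average.
f1_range = {gender: high - low for gender, (low, high) in f1_extremes.items()}
print(f1_range)
```

This gives 418 Hz for the females and 240 Hz for the males.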


Figure 6. Vowel diagram of the Malay vowels for the male and female categories


Figure 7. Vowel chart of the Malay vowels on the IPA


Three main features describe a vowel: height, backness, and roundedness. Height and backness
refer to the tongue position during vowel production. Roundedness refers to the shape of the lips.
The IPA calls high vowels “close” vowels and low vowels “open” vowels. The vowels in the middle rows
are called “mid vowels”. These terms describe the state of the mouth during the vowel’s pronunciation.
Based on Figure 7, only the vowel /a/ was identified as an open vowel, and the two close vowels were /i/
and /u/. The other vowels were identified as close-mid or open-mid vowels. For the backness feature, the
vowels on the right-hand side of the chart are called “back” vowels, those on the left-hand side “front”
vowels, and those in the middle “central” vowels. In the Malay language there are three front vowels /i/,
/e/, /ɛ/ and three back vowels /u/, /o/, /ɔ/. The others are central vowels. Upon closer observation of the
vowel positions, the vowel /e/ was located closer to the front compared to the same vowel in the IPA chart.
Based on this observation, the vowel /e/ in the Malay language is produced further towards the front of the
mouth. Roundedness is the third major vowel feature and it describes the lips and not the tongue. Front and
central vowels are usually unrounded, and back vowels are rounded. Therefore, the Malay language vowels
/a/, /e/, /i/, /ə/, /ɛ/ are the unrounded vowels and the vowels /o/, /u/, /ɔ/ are the rounded vowels. We
further quantified the eight vowels based on their average F1 and F2 and plotted them over the vowel chart
as (F1, F2). This is illustrated in Figure 8.


Figure 8. Vowel chart of the Malay vowels on the IPA with quantification of the first (F1) and second
formant (F2) of each vowel. For example, the vowel /i/, which is a high (close) vowel pronounced at the
front of the tongue, has an average (F1, F2) of (432, 2049)


Based on the overall average and standard deviation of the formant frequencies, we tabulated
a general range of F1 and F2 for all the Malay vowels in Table 3. The vowel height is represented by
the first formant and the backness by the second formant. For example, a vowel with an average first
formant of 500 Hz and an average second formant of 1200 Hz was declared a close-mid, back vowel.
Any uncategorized vowel may be classified using this proposed generalized range. A vowel whose average
first formant falls in the range of 246 to 663 Hz and whose average second formant falls in the range of
1065 to 2504 Hz was declared a high (close), front vowel, as shown in Table 3.


Table 3. Generalization of vowel height and vowel backness based on the average of first
and second formants

Vowel height   F1 (Hz)    Vowel backness   F2 (Hz)
Close          246-663    Front            1065-2504
Close-mid      370-676    Central          1196-1904
Open-mid       362-884    Back             823-1610
Open           658-964
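
A lookup against the ranges in Table 3 can be sketched as follows. Because the tabulated ranges overlap, this illustration returns every height and backness category whose range contains the given value rather than a single label. How one label is chosen among the candidates is not specified in the text, so no tie-breaking rule is assumed. The names `HEIGHT_F1`, `BACKNESS_F2`, `candidates` and `describe` are hypothetical.

```python
# F1 and F2 ranges in Hz, copied from Table 3.
HEIGHT_F1 = {
    "close": (246, 663),
    "close-mid": (370, 676),
    "open-mid": (362, 884),
    "open": (658, 964),
}
BACKNESS_F2 = {
    "front": (1065, 2504),
    "central": (1196, 1904),
    "back": (823, 1610),
}


def candidates(value, ranges):
    """Return every label whose inclusive range contains the value."""
    return [label for label, (low, high) in ranges.items() if low <= value <= high]


def describe(f1, f2):
    """Return candidate height labels (from F1) and backness labels (from F2)."""
    return candidates(f1, HEIGHT_F1), candidates(f2, BACKNESS_F2)


# The example from the text: average F1 of 500 Hz and average F2 of 1200 Hz.
heights, backness = describe(500, 1200)
print(heights, backness)
```

For the example values, close-mid and back are among the returned candidates, which agrees with the label given in the text, but other categories match as well because of the overlap.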


4. CONCLUSION 

This is the first initiative towards documenting all eight vowels of the Malay language using
formant frequencies. In this paper, the Malay vowels in spontaneous speech in Malaysia were investigated.
The first and second formant frequencies of the eight vowels (/a/, /e/, /i/, /u/, /ə/, /o/, /ɔ/, /ɛ/) were
analyzed to objectively measure the openness and closeness of the mouth, and the frontness and backness of
the pronunciation. The vowel diagram of the eight Malay vowels was charted to highlight the differences and
similarities between vowel categories and genders. This is the main contribution of this paper.
The vowel chart showed that the Malay dialect vowels comply with the IPA standard. The mean F1 and F2
of all eight vowels were documented for future reference in research or as an educational tool for language
learning. Comparing the vowels produced by males and females, the females have a wider range of formant
frequencies. However, further work needs to be done, such as in-depth analysis of the formant frequencies
in different dialects in Malaysia.